Indexing and Retrieval of Multimodal Lecture Recordings from Open Repositories for Personalized Access in Modern Learning Settings
Abstract
An increasing number of lecture recordings are available to complement face-to-face teaching and the more conventional content-based e-learning approaches. These recordings provide additional channels for remote students and time-independent access to the lectures. Many universities even offer complete series of recordings for hundreds of courses, which are available for public access, and this service provides added value for users outside the university. These lecture recordings use a great variety of media or modalities (such as video, audio, lecture media, presentation behavior) and formats. So far, none of the existing systems and services has sufficient retrieval functionality or supports appropriate interfaces to enable searching for lecture recordings across several repositories. This situation has motivated us to initiate research on a lecture recording indexing and retrieval system for knowledge transfer and learning activities in various settings. This system builds on our earlier experiences and prototypes developed within the MISTRAL research project. In this paper we outline requirements for an enhanced lecture recording retrieval system, introduce our solution and prototype, and discuss the initial results and findings.
Introduction and Motivation
The rapid evolution of communication technologies, advanced data networks and simplified multimedia information processing has greatly impacted learning experiences. Multimedia resources have become increasingly important in the learning process. They now include not only short training videos and interactive learning objects but also infotainment resources as well as asynchronous and synchronous communication channels (Rajkumar, Gütl, & Ramadoss, 2008; Gütl, 2008).
Since more appropriate technology has become available for recording, processing and distributing multimodal or multimedia sources, lecture recordings have become an important educational media type and have attracted increasing interest within the last few years. Further information on multimedia resources and their application in e-learning can be found in the literature, for example in Huang, Eze, & Webster (2006). For more than ten years, various lecture and presentation recording systems have been researched and developed, such as Classroom 2000 (Abowd et al., 1996), Authoring on the Fly (Bacher, Müller, Ottmann, & Will, 1997), Camtasia Studio (Camtasia Studio, 2008), E-Chalk (E-Chalk, 2008), Lecturnity (Lecturnity, 2008), MIT Lecture Browser (MITNEWS, 2007), TeleTeachingTool (Ziewer & Seidl, 2002), and virtPresenter (Mertens, Ketterl, & Vornberger, 2007). Consequently, an increasing number of lecture recordings are available, both to complement face-to-face teaching and the more conventional content-based e-learning approaches and to provide additional channels for remote students and time-independent access to the lectures. The majority of educational institutions restrict access to their own students. However, some universities offer complete series of course recordings for public access, for example MIT Open Courseware (MITOCW), the Carnegie Mellon University Open Learning Initiative (OLI) and Berkeley courses on YouTube (BERKELY, 2007). These openly accessible lecture recordings not only support the institutions’ own students but also provide valuable support to other students. They can be used in various learning settings such as self-directed learning, vocational training and life-long learning activities, and can also help other teachers or communities in developing countries.
From the institutions’ and teachers’ point of view, lecture recordings provide an additional, time-independent transmission channel and enable remote students or remote guest lecturers to be integrated into the learning setting. Free access to other institutions’ lecture recordings can also help teachers plan a new curriculum, prepare courses and lectures, and gain other viewpoints on specific subjects. From the students’ point of view, recordings of the courses they are enrolled in enable them to “consume” lectures they have missed, repeat difficult parts of the lectures and prepare for examinations. Lecture recordings from other institutions can also help students to get complementary viewpoints on the same topic. Students can also use the recordings as complementary learning content, especially if the course is presented by a highly specialized expert. Further details on teachers’ and students’ viewpoints can be found in Spinola (2008). The above-mentioned lecture recording systems provide a great variety of media types and modalities. These include one or several audio and video streams from the presenter; presentation media (e.g. presentation slides, whiteboard tools or other computer-based applications); back channels (e.g. audience or remote participants); speech-to-text transcripts or extraction; face recognition as well as facial expression and gesture recognition; other modalities from interaction with presentation media; online and offline annotations from students, teachers and communities; related documents (e.g. presentation slides, lecture notes, exercises, online tests, and other background knowledge); and further human-based or computer-based semantic annotations and enrichment (e.g. concept extraction, topic extraction, presentation type).
Our literature review has shown that (1) lecture recording systems focus on different media according to their main purpose and objective, (2) accessible lecture recordings are available in various media formats and practically none of the systems provides all of the above-mentioned media types and modalities, and (3) practically none of the existing systems and services has sufficient retrieval functionality or supports appropriate interfaces to enable searching across several repositories. These findings have motivated us to initiate research on a lecture recording indexing and retrieval system for knowledge transfer and learning activities in various settings, such as higher education, vocational training and life-long learning. This research builds on our experiences and prototypes developed within the MISTRAL research project (Gütl, 2008), which focused on an enhanced multimodal information system for meeting scenarios. The remainder of this paper is organized as follows: Chapter 2 outlines requirements for an enhanced lecture recording indexing and retrieval system, followed by a brief introduction to the MISTRAL system in Chapter 3. Based on that, the adaptation to lecture recordings is discussed in Chapter 4, and finally lessons learned are given in Chapter 5.
Requirements for an Enhanced Lecture Recording Indexing and Retrieval System
Within our research initiative on lecture recording systems, Spinola (2008) discusses in detail the requirements for an enhanced lecture or presentation recording system for both teachers and students.
These requirements can be summarized as follows:
• Manage and index lecture recordings and integrate relevant documents such as presentation slides, lecture notes, curricula and other background knowledge
• Support various source formats and sets of media; fetch them from various repositories; pre-process, convert and semantically enrich these media types to build unified internal representations that support the retrieval process and access to the data
• Retrieve and access lecture recordings at various levels of granularity (e.g. the entire lecture, or parts of the lecture segmented by topic, subtopic or content)
• Provide access from various end devices (e.g. personal computers, PDAs and smartphones), in different modalities (e.g. video streams, audio stream only, or text transcripts), and over different network infrastructures (e.g. broadband or wireless networks)
The MISTRAL System at a Glance
The MISTRAL system focused on technology-based methods for semantic annotation, extraction, indexing, retrieval and visualization of multimodal and multimedia data streams in the meeting application domain (García-Barrios & Gütl, 2006). According to the CAMIS model (Conceptual Architecture of Multimodal Information Systems), the conceptual units of such systems are as follows (Gütl, 2008b):
(1) Capturing handles the provision of proper data streams from diverse sources for the further process chain.
(2) Abstraction deals with data processing and information extraction, which may range from simple to more complex tasks (e.g. format conversion, compression or summarization, and information extraction at multiple levels of abstraction).
(3) Fusion merges and combines information from the unimodal data sources, which may be performed on different levels of abstraction.
(4) Storage handles the persistent internal representation of the unimodal data streams and the extracted information, and also manages access to and delivery of the data in a trustworthy and secure manner.
(5) Retrieval supports the process of finding relevant information or delivering useful data, and manages browsing and searching in various modalities on different semantic levels and structure composites.
(6) Presentation manages the combined and synchronized presentation of multimodal output data for information consumption.
The MISTRAL system is designed to deal with all of the aforementioned conceptual units in the meeting domain. Figure 1 outlines the MISTRAL architecture at a glance and shows the relations to the CAMIS units. On the far left of the diagram, the unimodal units for Audio, Text, Video and other environmental Sensors deal with capturing and abstraction within the CAMIS model. Building on this unimodal data processing, the Multimodal Merging unit combines extracted data on various abstraction levels according to semantic, spatial and temporal characteristics. Further information enrichment and contradiction checks on the extracted information are performed within the Semantic Enrichment unit, using domain knowledge. Both the Multimodal Merging unit and the Semantic Enrichment unit address fusion tasks of the CAMIS model. The captured data from the different modalities, together with metadata and the extracted and derived information on various semantic levels, are managed and made accessible by the Data Repository unit. The units described so far make up the MISTRAL Core System.
Different Semantic Applications can make use of the functionality of the core system and the great variety of available data on different semantic levels. Depending on the concrete application scenario, different aspects of retrieval and presentation in the CAMIS model are addressed (Gütl, 2008b).
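As a rough illustration of how the six CAMIS units chain together, the following minimal Python sketch passes toy data through one function per unit. It is not the MISTRAL implementation: all names (Annotation, capture, abstract, fuse, store, retrieve, present, run_pipeline), the example streams, and the simple overlap-based fusion rule are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical time-stamped piece of information from one modality."""
    modality: str   # e.g. "audio", "video", "text"
    start: float    # seconds from the beginning of the recording
    end: float
    label: str      # extracted information, here simply a keyword


def capture(raw_streams):
    """Capturing: provide the per-modality streams for the process chain."""
    return {modality: list(events) for modality, events in raw_streams.items()}


def abstract(streams):
    """Abstraction: turn each unimodal stream into normalized annotations."""
    return [
        Annotation(modality, start, end, label.lower())
        for modality, events in streams.items()
        for start, end, label in events
    ]


def fuse(annotations):
    """Fusion (assumed rule): merge annotations that share a label and overlap in time."""
    fused = []
    for ann in sorted(annotations, key=lambda a: (a.label, a.start)):
        last = fused[-1] if fused else None
        if last and last["label"] == ann.label and ann.start <= last["end"]:
            last["end"] = max(last["end"], ann.end)
            last["modalities"].add(ann.modality)
        else:
            fused.append({"label": ann.label, "start": ann.start,
                          "end": ann.end, "modalities": {ann.modality}})
    return fused


def store(fused):
    """Storage: keep fused records in an in-memory index keyed by label."""
    index = {}
    for record in fused:
        index.setdefault(record["label"], []).append(record)
    return index


def retrieve(index, query):
    """Retrieval: exact keyword lookup, case-insensitive."""
    return index.get(query.lower(), [])


def present(record):
    """Presentation: render one fused segment as a line of text."""
    modalities = ", ".join(sorted(record["modalities"]))
    return f"{record['label']}: {record['start']:.0f}s-{record['end']:.0f}s ({modalities})"


def run_pipeline(raw_streams, query):
    """Chain the six units in the order given by the CAMIS model."""
    index = store(fuse(abstract(capture(raw_streams))))
    return [present(record) for record in retrieve(index, query)]


# Toy input: (start, end, keyword) events per modality.
RAW_STREAMS = {
    "audio": [(10.0, 20.0, "Indexing")],
    "text": [(15.0, 30.0, "indexing"), (40.0, 50.0, "retrieval")],
}
```

With this toy input, run_pipeline(RAW_STREAMS, "Indexing") returns a single fused segment covering 10s to 30s that was found in both the audio and the text stream. A real system would replace each function with far richer processing, but the order of the units stays the same.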
Publication date: 2009